Heuristic Methods for Reducing Errors of Geographic Named Entities Learned by Bootstrapping
Authors
Abstract
One of the issues in bootstrapping for named entity recognition is how to control the annotation errors introduced at every iteration. In this paper, we present several heuristics for reducing such errors using external resources such as WordNet, an encyclopedia, and Web documents. Bootstrapping is applied to identify and classify fine-grained geographic named entities, which are useful for applications such as information extraction and question answering, as well as standard named entities such as PERSON and ORGANIZATION. The experiments demonstrate the usefulness of the suggested heuristics and report the learning curve evaluated at each bootstrapping loop. When applied to a newspaper corpus, our approach achieved an F1 value of 87, which is quite promising for the fine-grained named entity recognition task.
Similar resources
A Bootstrapping Approach for Geographic Named Entity Annotation
Geographic named entities can be classified into many subtypes that are useful for applications such as information extraction and question answering. In this paper, we present a bootstrapping algorithm for the task of geographic named entity annotation. In the initial stage, we annotate a raw corpus using seeds. From the initial annotation, boundary patterns are learned and applied to the corp...
Microsoft Word - camera-ready.docx
We explore methods for effectively extracting information from clinical narratives, which are captured in a public health consulting phone service called HealthLink. The currently available data consists of dialogues constructed by nurses while consulting patients on the phone. Since the data are interviews transcribed by nurses during phone conversations, they include a significant volume and ...
Bootstrapping Biomedical Ontologies for Scientific Text using NELL
We describe an open information extraction system for biomedical text based on NELL (the Never-Ending Language Learner) (Carlson et al., 2010), a system designed for extraction from Web text. NELL uses a coupled semi-supervised bootstrapping approach to learn new facts from text, given an initial ontology and a small number of “seeds” for each ontology category. In contrast to previous applicat...
Factors Affecting Medication Errors from Nurses' Perspective: Lessons Learned
Introduction: Medical errors are among the most threatening faults against patient safety in all countries. The most frequent medical errors are medication errors, which can lead to serious effects and even death in patients. Therefore, this study aimed to explain factors affecting medication errors from the viewpoints of nurses in order to present strategies to reduce these errors. Methods:...
A Bootstrapping Method to Assess Software Impact in Full-Text Papers
Introduction and Motivation: There is a concerted effort to study science of science in multiple spheres. However, a clear gap exists in how to incorporate digital outputs, such as software, as an integral component in scholarly communication. This tension has become aggravated in recent years because software can be the end products in many scientific inquiries. Therefore, there is the need to ...
Publication date: 2005